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Abstract 

The synthesis problem asks for the automatic construction of a system from its specification. In the 
traditional setting, the system is “constructed from scratch” rather than composed from reusable components. 
However, this is rare in practice, and almost every non-trivial software system relies heavily on the use of 
libraries of reusable components. Recently, Lustig and Vardi introduced dataflow and controlflow synthesis 
from libraries of reusable components. They proved that dataflow synthesis is undecidable, while conttolflow 
synthesis is decidable. The problem of controlflow synthesis from libraries of probabilistic components was 
considered by Nain, Lustig and Vardi, and was shown to be decidable for qualitative analysis (that asks 
that the specification be satisfied with probability 1). Our main contributions for conttolflow synthesis from 
probabilistic components are to establish better complexity bounds for the qualitative analysis problem, and to 
show that the more general quantitative problem is undecidable. For the qualitative analysis, we show that the 
problem (i) is EXPTIME-complete when the specification is given as a deterministic parity word automaton, 
improving the previously known 2EXPTIME upper bound; and (ii) belongs to UP fl coUP and is parity- 
games hard, when the specification is given directly as a parity condition on the components, improving the 
previously known EXPTIME upper bound. 

1 Introduction 

Synthesis from existing components. Reactive systems (hardware or software) are rarely built from scratch, but are 
mostly developed based on existing components. A component might be used in the design of multiple systems, 
e.g., function libraries, web APIs, and ASICs. The construction of systems from existing reusable components is 
an active research direction, with several important works, such as component-based construction [24], “interface- 
based design” [17], web-service orchesttation [5]. The synthesis problem asks for the automated construction of a 
system given a logical specification. For example, in LTL (linear-time temporal logic) synthesis, the specification 
is given in LTL and the reactive system to be constructed is a finite-state transducer [23]. In the traditional LTL 
synthesis setting, the system is “constructed from scratch” rather than “composed” from existing components. 
Recently, Lustig and Vardi introduced the study of synthesis from reusable or existing components [20]. 

The model and types of composition. The precise mathematical model for the components and their composition 
is an important concern (and we refer the reader to [20, 21] for a detailed discussion). As a basic model for 
a component, following [20], we abstract away the precise details of the component and model a component 
as a transducer, i.e., a finite-state machine with outputs. Transducers constitute a canonical model for reactive 
components, abstracting away internal architecture and focusing on modeling input/output behavior. In [20], 
two models of composition were studied, namely, dataflow composition, where the output of one component 
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becomes an input to another component, and controlflow composition, where at every point of time the control 
resides within a single component. The synthesis problem for dataflow composition was shown to be undecidable, 
whereas the controlflow composition was shown to be decidable [20]. 

Synthesis for probabilistic components. While [20] considered synthesis for non-probabilistic components, the 
study of synthesis for controlflow composition for probabilistic components was considered in [21]. Probabilistic 
components are transducers with a probabilistic transition function, that corresponds to modeling systems 
where there is probabilistic uncertainty about the effect of input actions. Thus the controlflow composition 
for probabilistic transducers aims at construction of reliable systems from unreliable components. There is a rich 
literature about verification and analysis of such systems, cf. [25, 15, 16, 26, 4, 19], as well as about synthesis in 
the presence of probabilistic uncertainty [3]. 

Qualitative and quantitative analysis. There are two probabilistic notions of correctness, namely, the qualitative 
criterion that requires the satisfaction of the specification with probability 1, and the more general quantitative 
criterion that requires the satisfaction of the specification with probability at least r/, given 0 < r/ < 1. 

The synthesis questions and previous results. In the synthesis problem for controlflow composition, the input is 
a library L of probabilistic components, and we consider specifications given as parity conditions (that allow us 
to consider all tu-regular properties, which can express all commonly used specifications in verification). The 
qualitative (resp., quantitative) realizability and synthesis problems ask whether there exists n finite system S 
built from the components in C, such that, regardless of the input provided by the external environment, the 
traces generated by the system S satisfy the specification with probability 1 (resp., probability at least if). Each 
component in the library can be instantiated an arbitrary number of times in the construction and there is no a- 
priori bound on the size of the system obtained. The way the specification is provided gives rise to two different 
problems: (i) embedded parity realizability, where the specification is given in the form of a parity index on the 
states of the components; and (ii) DPW realizability, where the specification is given as a separate deterministic 
parity word automaton (DPW). The results of [21] established the decidability of the qualitative realizability 
problem, namely, in EXPTIME for the embedded parity realizability problem and 2EXPTIME for the DPW 
realizability problem. The exact complexity of the qualitative problem and the decidability and complexity of the 
quantitative problem were left open, which we study in this work. 



Qualitative 

Quantitative 

Our Results 

Previous Results 

Our Results 

Previous Results 

Embedded Parity 
(with exit control) 

UP n coUP 
(Parity-games hard) 

EXPTIME 

UP n coUP 
(Parity-games hard) 

Open 

Embedded Parity 
(unrestricted exit control) 

PTIME 

Not considered 

PTIME 

Not considered 

DPW Specifications 

EXPTIME-c 

2EXPTIME 

Undecidable 

Open 


Table 1: Computational complexity of synthesis from probabilistic components. 

Our contributions. Our main contributions are as follows (summarized in Table 1). 

1. We show that both the qualitative and quantitative realizability problems for embedded parity lie in UP fl 
coUP, and even the qualitative problem is at least parity-games hard (the parity-games problem also belongs 
to UP n coUP [18], and the existence of a polynomial-time algorithm is a major and long-standing open 
problem). Moreover, we show that a special case of the quantitative embedded parity problem (namely, 
unrestricted exit control) can be solved in polynomial time: a probabilistic component has a set of exits, 
and in general there is a constraint on the current exit and the next component to transfer the control, and 
in the unrestricted exit control problem no such constraint is imposed. 




2. We show that the qualitative realizability prohlem for DPW speeifieations is EXPTIME-eomplete (an 
exponential improvement over the previous 2EXPTIME result). Einally, we show that the quantitative 
realizability problem for DPW speeifieations is undeeidable. 

Technical contributions. Our two main teehnieal eontributions are as follows. Eirst, for the realizability of 
embedded parity speeifieations, while the most natural interpretation of the problem is as a partial-observation 
stoehastie game (as also eonsidered in [21]), we show that the problem ean be redueed in polynomial time to a 
perfeet-information stoehastie game. Seeond, for the realizability of DPW speeifieations, we eonsider partial- 
observation stoehastie games where the strategies eorrespond to a eorreet eomposition that defines, given an 
exif sfafe of a eomponenf, fo whieh eomponenf fhe eonfrol should be Iransferred. Sinee we aim af a finile- 
sfafe sysfem, we need fo eonsider sfrafegies wifh finite memory, and sinee fhe eonfrol flow is deferminisfie, 
we need fo eonsider pure (non-randomized) sfrafegies. Moreover, sinee fhe eomposifion musf be independenf 
of fhe infernal exeeufions of fhe eomponenfs, we need fo eonsider sfrafegies wifh stuttering invarianee. Eor 
example, we eonsider sfuffer-invarianf sfrafegies fhaf musf play fhe same when observafions are repealed, and 
eollapsed sluller-invarianl sfrafegies fhaf are sluller-invarianl sfrafegies bul nol allowed fo observe fhe lenglh of 
fhe repelilions. We presenl polynomial-lime reduelions for bolh sluller-invarianl and eollapsed sluller-invarianl 
sfrafegies fo games wifh slandard observalion-based sfrafegies. Our resulls eslablish oplimal eomplexify resulls 
for qualilalive analysis of parlial-observalion sloehaslie games wifh finile-memory sluller-invarianl and eollapsed 
sluller-invarianl sfrafegies, whieh are of independenf inleresl. Einally, we presenl a polynomial reduelion of fhe 
qualilalive realizabilily for DPW speeifiealions fo parlial-observalion sloehaslie games wifh eollapsed sluller- 
invarianl sfrafegies and oblain fhe EXPTIME-eomplele resull. 

2 Definitions 

Probability distributions. A probability distribution on a finite set X is a funetion / : X —>■ [0,1] sueh that 
/(®) ~ to denote the set of all probability distributions on set X. 

2,1 Transducers In this section we present the definitions of deterministic and probabilistic transducers, and 
strategies for them. 

Deterministic transducers. A deterministic transducer is a tuple B = {Ej, Eq, Q, qo, S, L), where: Ej is a finite 
input alphabet, Eq is a finite output alphabet, Q is a finite set of states, qo € Q is an initial state, L : Q ^ Eq 
is an output function labeling states with output letters, and <5 : Q x Q is a transition function. We define 

5* : > Q as follows: (5*(e) = qo and for all x G and a G Ej, we have 5*{x ■ a) = 5{5*{x), a). 

Probabilistic transducers, A probabilistic transducer, is a tuple T = {Ei,Eo,Q,qo,S, F, L), where: Ej is 
a finite input alphabet, Eq is a finite output alphabet, Q is a finite set of states, go £ Q is an initial state, 
6 : {Q\F) xEj ^ ^{Q) is a probabilistic transition function, F C Q is a set of exit states, and L : Q ^ Eq is 
an output function labeling states with output letters. Note that there are no transitions out of an exit state. If F 
is empty, we say T is a probabilistic transducer without exits. Note that deterministic transducers can be viewed 
as a special case of probabilistic transducers. 

Strategies for transducers. Given a probabilistic transducer M = {Ej,Eo,Q,qo,6, F, L), a strategy for M 
is a function / : > T>{Ei) that probabilistically chooses an input for each finite sequence of states. We 

denote by F the set of all strategies. A strategy is memoryless if the choice depends only on the last state in the 
sequence. A memoryless strategy can be written as a function g : Q ^ V{Ei). A strategy is pure if the choice 
is deterministic. A pure strategy is a function h : > S/, and a memoryless and pure strategy is a function 

h : Q ^ Ej. 

Probability measure. A strategy / along with a probabilistic transducer M, with set of states Q, induces a 
probability distribution on Q‘^, denoted p,f. By standard measure-theoretic arguments, it suffices to define pf for 
the cylinders of Q‘^, which are sets of the form ■ Q^, where G Q*. Eirst we extend 6 to exit states as follows: 
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for a ^T,j and q ^ F, q' e Q, let 5{q, a){q) = 1 and S{q, a){q') = 0 if q' q. Then we define /i/((?o • Q‘^) = 1> 
and for /3 G Q*,q,q' G we have Hf{Pqq' ■ Q^) = Hf{f3q) • '^){q'))- These eonditions 

say that there is a unique start state, and the probability of visiting a state q', after visiting /3q, is the same as the 
probability of the strategy pieking a partieular letter multiplied by the probability that the transdueer transitions 
from q to q' on that input letter, summed over all input letters. 

2.2 Library of Components A library is a set of probabilistie transdueers that share the same input and output 
alphabets. Eaeh transdueer in the library is ealled a component type. Given a finite set of direetions D, we say a 
library L has width D, if eaeh eomponent type in the library has exaetly \D\ exit states. Sinee we ean always add 
dummy unreaehable exit states to any eomponent, we assume, w.l.o.g., that all libraries have an assoeiated width, 
usually denoted D. In the eontext of a partieular eomponent type, we often refer to elements of D as exits, and 
subsets of D as sets of exits. 

2.3 Controlflow Composition from Libraries We first informally deseribe the notion of eontrolflow eompo- 
sition of eomponents from a library as defined in [21]. The eomponents in the eomposition take turns interaeting 
with the environment, and at eaeh point in time, exaetly one eomponent is aetive. When the aetive eomponent 
reaehes an exit state, eontrol is transferred to some other eomponent. Thus, to define a eontrolflow eomposition, 
it suffiees to name the eomponents used and deseribe how eontrol should be transferred between them. We use a 
deterministie transdueer to define the transfer of eontrol. Eaeh library eomponent ean be used multiple times in a 
eomposition, and we treat these oeeurrenees as distinet component instances. We emphasize that the eomposition 
ean eontain potentially arbitrarily many instanees of eaeh eomponent type inside it. Thus, the size of the eompo¬ 
sition, a priori, is not bounded. Note that our notion of eomposition is static, where the eomponents ealled are 
determined before run time, rather than dynamic, where the eomponents ealled are determined during run time. 

Eet £ be a library of width D. A composer over £ is a deterministie transdueer C = {D, £, Ad, Mq, A, A). 
Here Ad is an arbitrary finite set of states. There is no bound on the size of Ad. Eaeh Mj G Ad is a eomponent 
from £ and A(Mj) G £ is the type of Mj. We use the following notational eonvention for eomponent instanees 
and names: the upright letter M always denotes eomponent names (i.e., states of a eomposer) and the italieized 
letter M always denotes the eorresponding eomponent instanees (i.e., elements of £). Eurther, for notational 
eonvenienee we often write Mj direetly instead of A(Mj). Note that while eaeh Mj is distinet, the eorresponding 
eomponents Mj need not be distinet. Eaeh eomposer defines a unique eomposition over eomponents from £. The 
eurrent state of the eomposer eorresponds to the eomponent that is in eontrol. The transition funetion A deseribes 
how to transfer eontrol between eomponents: A(M, i) = M' denotes that when the eomposition is in the fth final 
slate of eomponent M it moves to the start state of eomponent M'. A eomposer ean be viewed as an implieit 
representation of a eomposition. We give an explieit definition of eomposition below. 


Definition 1. (Controlflow Composition) Let C = {D,C,M, Mq, A, A) be a composer over library 
C of width D, where Ad = {Mo,...,M„}, A(Mj) = {T,j,F,o,Qi,qQ,Si,Fi,Li) and Fi = {qj, : x G D}. 
The composition defined by C, denoted Tc, is a probabilistic transducer {T,j,F,o,Q,qo,S,$, L), where Q = 
ur=o(Q* ^ Qo = (9o) 0). L{{q, i)) = Lfiq), and the transition function S is defined as follows: For a G S/, 
{q,i) G Q and {q',j) G Q, 


1. If q^Qi\ Fi, then 




5i{q,a){q') ifi = j 
0 otherwise 



2. Ifq = g* G Fi, where A(Mj,x) = M^, then 




1 if j = k and q' = qQ 
0 otherwise 


Note that the eomposition is a probabilistie transdueer without exits. When the eomposition is in a state 
{q,i) eorresponding to a non-exit state q of eomponent Mi, it behaves like Mi. When the eomposition is in a 
state {qf,i) eorresponding to an exit state qf of eomponent Mi, the eontrol is transferred to the start state of 
another eomponent as determined by the transition funetion of the eomposer. Thus, at eaeh point in time, only 
one eomponent is aetive and interaeting with the environment. 

2.4 Parity objectives and values for probabilistic transducer An index function for a transducer is a function 
that assigns a natural number, called a priority index, to each state of the transducer. An index function a defines 
a parity objective <l>a that is the subset of that consists of the set of infinite sequence of states such that 
the minimum priority that is visited infinitely often is even. Given a probabilistic transducer T and a parity 
objective <h, the value of the probabilistic transducer for the objective, denoted as val(T, ‘h), is infygj-fi/(^). 
In other words, it is the minimal probability with which the parity objective is satisfied over all sfrafegies in fhe 
fransducer. 


2.5 The synthesis questions In this work we consider two types of synthesis questions for controlflow 
composition. In the first problem (namely, synthesis for embedded parity) the parity objective is specified 
directly on the state space of the library components, and in the second problem (namely, synthesis from DPW 
specifications) the parity objective is specified by a separate deterministic parity automaton. 

2.5.1 Synthesis for Embedded Parity We first consider an index function that associates to each state of the 
components in the library a priority, and a specification defined as a parity condition over the sequence of visited 
states. 

Exit control relation. Given a library E of width D, an exit control relation is a set R C D x C. We say that 
a composer C = {D,C,A4, Mq, A, A) is compatible with R, if the following holds: for all M, M' G Ad and 
i G D, if A(M, i) = M' then (t, M') G R. Thus, each element of R can be viewed as a constraint on how the 
composer is allowed to connect components. An exit control relation is non-blocking if for every i £ D there 
exists a component M £ C such that (i, M) £ R (i.e., every exit has at least one possible component for the next 
choice). For technical convenience we only consider non-blocking exit control relations. R = D x C (i.e., 
there is no constraint on the composer to connect components), then we refer to the relation as unrestricted exit 
control relation. 

Definition 2. (Embedded parity realizability and synthesis.) Consider a library C of width D, an 
exit control relation Rfor L, and an index function afar the components in L that defines the parity objective 
The qualitative (resp., quantitative) embedded parity realizability problem is to decide whether there exists a 
composer C over C, such that C is compatible with R, and val(7c, = 1 (resp., val(7b, a given 

rational threshold rj £ (0,1)). A witness composer for the qualitative problem is called an almost-sure composer, 
and for the quantitative problem is called an q-optimal composer. The corresponding embedded parity synthesis 
problems are to find such a composer C if it exists. 

2.5.2 Synthesis for DPW Specifications A deterministic parity automaton (DPW) is a deterministic transducer 
where the labeling function is an index function that defines a parity objective. Given a DPW A, every word 
(infinite sequence of input letters) induces a run of the automaton, which is an infinite sequence of states, and 
the word is accepted if the run satisfies the parity objective. The language La of a DPW A is the set of words 
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accepted by A. Let A be a deterministic parity automaton (DPW), M be a probabilistic transducer and £ be a 
library of components. We say A is a monitor for M (resp. C) if the input alphabet of A is the same as the output 
alphabet of M (resp. £). Let ^ be a monitor for M and let La be the language accepted by A. The value of M 
for A, denoted as val(M, ^), is inf jgj-Note that the compatibility of the composer with an exit 
control relation can be encoded in the DPW (without loss of generality, we do not allow two distinct exit states 
to have the same output). 

Definition 3. (DPW realizability and synthesis.) Consider a library C and a DPW A that is a 
monitor for L. The qualitative (resp., quantitative) DPW probabilistic realizability problem is to decide whether 
there exists a composer C over L, such that va I (Tc,A) = 1 ( resp., vaI (7b ,A)>ri, for a given rational threshold 
r] G (0,1)). A witness composer for the qualitative problem is called an almost-sure composer, and for the 
quantitative problem is called an rj-optimal composer. The corresponding DPW probabilistic synthesis problems 
are to find such a composer C if it exists. 

Remark 1. We remark that the realizability problem for libraries with components can be viewed as a 2-player 
partial-observation stochastic parity game. Informally, the game can be described as follows: the two players are 
the composer C and the environment E. The C player chooses components and the E player chooses sequence 
of inputs in the components chosen by C. However, C cannot see the inputs of E or even the length of the time 
inside a component. At the start C chooses a component M from the library L. The turn passes to E, who 
chooses a sequence of inputs, inducing a probability distribution over paths in M from its start state to some exit 
X in D. The turn then passes to C, which must choose some component M' in L and pass the turn to E and so 
on. A^ C cannot see the moves made by E inside M, the choice of C cannot be based on the run in M, but only 
on the exit induced by the inputs selected by E and previous moves made by C. So C must choose the same next 
component M' for different runs that reach exit x of M. In general, different runs will visit different priorities 
inside M. This is a two-player stochastic parity game where one of the players (namely the composer C) does not 
have full information. However, there is also a crucial difference from traditional partial-observation games, as 
the composer does not even see the number of steps executed inside a component. IfC has a winning strategy that 
requires finite memory, then such a strategy would yield a suitable finite composer to satisfy the parity objective 
defined by the index function a, thus solving the synthesis problem. 

3 The Complexity of Realizability for Embedded Parity 

In this section we will establish the results for the complexity of realizability for embedded parity. In [21] 
the problem was interpreted as a partial-observation game (see Remark 1). While the natural interpretation of 
the embedded parity problem is a partial-observation game, we show how the problem can be interpreted as a 
perfect-information stochastic game. 

3.1 Perfect-information Stochastic Parity Games In this section we present the basic definitions and results 
for perfect-information stochastic games. 

Perfect-information stochastic games. A perfect-information stochastic game consists of a tuple G = 
(5,5i, 52, Ai, A 2 , (5^), where 5 is a hnite set of states partitioned into player-1 states (namely, 5i) and 
player-2 states (namely 52 ), Ai (resp., A 2 ) is the set of actions for player 1 (resp., player 2), and 5^ : 
(5i X Ai) U (52 X A 2 ) —)■ D(S) is a probabilistic transition function that given a player-1 state and player- 
1 action, or a player-2 state and a player-2 action gives a probability distribution over the successor states. If 
the transition function is deterministic (that is the codomain of 5*^ is 5 instead of V(S)), then the game is a 
perfect-information deterministic game. 

Plays and strategies. A play is an infinite sequence of state-action pairs (soaosiui • • •) such that for all y > 0 
we have that if Sj G Si for i G {1,2}, then Uj G A* and 5^{sj,aj){sj.^-i) > 0. A strategy is a recipe for 



a player to choose actions to extend finite prefixes of plays. Formally, a sfrafegy tt for player 1 is a function 
TT : S* ■ Si ^ V{Ai) fhaf given a finite sequence of visifed sfafes gives a probabilify disfribufion over fhe actions 
(fo be chosen nexf). A pure sfrafegy chooses a deterministic action, i.e., is a function vr : 5* • 5i ^ Ai. A pure 
memoryless sfrafegy is a pure sfrafegy fhaf does nof depend on fhe finite prefix of fhe play buf only on fhe currenf 
slate, i.e., is a funclion tt : 5i — Ai. The definilions for player-2 sfrafegies r are analogous. We denofe by If 
(resp., n™) fhe sef of all (resp., all pure memoryless) sfrafegies for player 1, and analogously F (resp., F™ for 
player 2). Given sfrafegies vr G If and r G F, and a slarling slale s, Ihere is a unique probabilify measure over 
evenls (i.e., measurable subsefs of S^), which is denoted as 

Finite-memory strategies. A pure player-1 sfrafegy uses finite-memory if if can be encoded by a Iransducer 
{Tl,mo,TTu,7rn) where 9JT is a finite sef (fhe memory of fhe sfrafegy), mo G Oft is fhe initial memory value, 
TTu : sot X 5 ^ sot is fhe memory-updafe function, and 7r„ : SOt ^ Ai is fhe nexf-acfion function. Nofe fhaf a 
finile-memory sfrafegy is a deterministic Iransducer wilh inpul alphabel S, oulpul alphabel Ai, where ttu is fhe 
delerminislic fransilion function, and vr^ is fhe oulpul labeling funclion. However, for finile-memory sfrafegies, 
since fhe inpul and oulpul is always fhe sef of sfafes and actions for player 1, for simplicily, we will represenl 
Ihem as a luple (SOt, mo, vr^, 7r„). The size of fhe sfrafegy is fhe number |9ft| of memory values. If fhe currenf slate 
is s, and fhe currenf memory value is m, Ihen fhe memory is updated lo m' = TTuim, s), and fhe sfrafegy chooses 
fhe nexf action 7r„(m'). Formally, (Oft, mo, vr^, vr^) defines fhe sfrafegy vr such fhaf Tr{p) = Trn{^u{rno, p)) for all 
p G S^, where tTu extends tt^ fo sequences of slates as expected. 

Parity objectives, almost-sure, and value problem. Given a perfect-information stochastic game, a parity 
objective is defined by an index function a on the state space. Given a strategy tt, the value of the strategy 
in a state s of the game G with parity objective denoted by val^(7r, <ha)(s), is the infimum of the 
probabilities among all player-2 strategies, i.e., val*^(7r, <hQ,)(s) = inf^gr ]Ps’’^(‘ha). The value of the game 
is val‘^(<ha)(s) = sup^gnA. strategy tt is almost-sure winning from s if val*^(7r, <ha)(s) = 1. 
The following theorem summarizes the basic results about perfect-information games. 

Theorem 3.1. The following assertions hold [14, 6, 9, 12, !].• 

1. (Complexity). The quantitative decision problem (of whether val*^(<hQ) > p, given rational p G (0, l])for 
perfect-information stochastic parity games lies in NP fl coNP (also UP PI coUP). 

2. (Memoryless determinacy). We have 

val'^(T>a)(s) = sup inf PJ’^(T>a) = inf sup P^’^(4 >q), 

tge™ vren 

i.e., the quantification over the strategies can be restricted to tt ^ jqPM .j- ^ pPM 


3.2 Complexity Results We first present the reduction for upper bounds. 


The upper-bound reduction. Consider a library L of width D, a non-blocking exit control relation R for L, 
and an index function a for L that defines the parity objective 4 >q. Let the number of components he k -\- 1, 
and let Mi = (S/, So, Qi, Qq, Si, Fi,Li) for 0 < i < k. Let us denote by [/c] = {0,1,2,... , k}. We define 
a perfect-information stochastic game Gc = {S, Si, S 2 , Ai, A 2 ,S^) with an index function ac as follows: 
^ = \fi=oiQ^ X {^}) U {A}, 5i = Uto(^i X {t}), 52 = 5 \ 5i, Ai = [k], and A 2 = S/. The state A 
is a losing absorbing state (i.e., a state with self-loop as the only outgoing transition and assigned odd priority by 
the index function ac), and the other transitions defined by the function S'f are as follows: (i) for s = {q, i) G S 2 , 
and fj G A 2 


^c{{(iu),<y){{q' g)) 


Si{q,(y){q') ifi = i 

0 otherwise 


(ii) for s = G Si and j G [k], we have that if {x,Mj) G R, then 5‘f{{ql.,i),j){{qQ,j)) = 1, else 

S^({ql., i),j)(l.) = 1. The intuitive description of the transitions is as follows: (1) Given a player-2 state that is 
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a non-exit state g in a eomponent Mi, and an aetion for player 2 that is an input letter, the transition funetion 
mimies the transition 6i of Mj; and (2) given a player-1 state that is an exit state g* in eomponent i, and an aetion 
for player 1 that is the ehoiee of a eomponent j, if {x, Mj) is allowed by R, then the next state is the starting state 
of eomponent j, and if the ehoiee (x, Mj) is invalid (not allowed by R), then the next state is the losing absorbing 
state _L. For all {q, i) £ S \ {_L} we have adiq, i)) = o:{q), and we denote by the parity objeetive in Gc- 
Note the similarity of the state spaee deseription in eomparison with Definition 1 for eontrolfiow eomposition. 
Correctness of reduction. There are two steps to establish eorreetness of the reduetion. The first step is given a 
eomposer for C to eonstruet a finite-memory strategy for player 1 in Gc- Intuitively, this is simple as a eomposer 
represents a strategy for a partial-observation game (Remark 1), whereas in Gc we have perfeet information. 
However, not every strategy in Gc can be eonverted to a eomposer (i.e., a perfeet-information game strategy 
eannot be eonverted to a partial-observation strategy). But we show that a pure memoryless strategy in Gc can be 
eonverted to a eomposer. While [21] treats the problem as a partial-observation game, our insight to eonvert pure 
memoryless strategies of the perfeet-information game Gc to eomposers allows us to establish the eorreetness 
with respeet to Gc- We present both the steps below. 

Lemma 3.1. Consider a library L of width D, a non-blocking exit control relation R for L, and an index 
function afar C that defines the parity objective <ha. Let Gc be the corresponding perfect-information stochastic 
game with parity objective ^aa- ^or all composers C, and the corresponding strategy -kq in Gc we have 
ys\{Tc,^J) = val'^^(7rc-,$aG)((g§,0)). 

Proof Composer to finite-memory strategies. Given a eomposer C = (D, £, Ad, Mq, A, A) for the library 
we define a finite-memory strategy vrc = (9R, mo, vr„, tt^) for the perfeet-information stoehastie game Gc as 
follows: (a) 9R = Ad and uiq = Mq; and (b) 7r„(Mi, s) = Mj for s G S 2 , and 7r„(Mj, {qi,j)) = for s G S'! if 
A(Mj, x) = M^, where A(Mj) = Mj-, and (e) 7r„(Mj) = j where A(Mj) = Mj. 

In other words, the finite-memory strategy has the same state spaee as the eomposer, and if the eurrent state 
is a player-2 state, then it does not update the memory state, and given the eurrent state is a player-1 state it 
updates it memory state aeeording to the transition funetion of the eomposer, and the aetion played is aeeording 
to the labeling funetion of the transdueer. In other words, the strategy -kq mimies the eomposer, and there is a 
one-to-one eorrespondenee between strategies for player 2 in the perfeet-information stoehastie game, and the 
strategies of the environment in Tc- Without loss of generality we eonsider that the eomposer must always start 
with the first eomponent (i.e., Mf) and henee the starting state is (gf], 0). This gives us the desired result. I 

For the seeond step we first eonsider valid pure memoryless strategies. 

Valid pure memoryless strategies in Gc- A pure memoryless strategy vr in Gc is valid if the following eondition 
holds: for all states G 5i if 7:{{ql.,i)) = j, then {x,Mj) G R, i.e., the ehoiees of the pure memoryless 

strategies respeet the exit eontrol relation. 

Lemma 3.2. Consider a library C of width D, a non-blocking exit control relation Rfor C, and an index function 
a for L that defines the parity objective <l>o. Let Gc be the corresponding perfect-information stochastic game 
with parity objective ^ao- ^or all valid pure memoryless strategies tt in Gc, and the corresponding composer 
Gt,, we have val(7c,,,^a) = val'^^(7r, T>„Q)((g[J, 0)). 

Proof. Valid pure memoryless strategies to composers. Given a valid pure memoryless strategy tt in Gc we 
define a eomposer Gt^ = {D, Ad, Mq, A, A) as follows: Ad = [k], Mq = 0, A(f) = Mi, and for 0 < f < /c and 
X £ D -we have fhat A{i, x) = j where TT{{ql.,i)) = j for gj, G F). In other words, for the eomposer there is a 
state for every eomponent, and given a eomponent and an exit state, the eomposer plays as the pure memoryless 
strategy. Note that sinee vr is valid, the eomposer obtained from tt is eompatible with the relation R. Note that 
the eomposer mimies the pure memoryless strategy, and there is a one-to-one eorrespondenee between strategies 
of player 2 in Gc and strategies of the environment in Tc,, ’ whieh establishes the result. I 



Lemma 3.3. Consider a library L of width D, a non-blocking exit control relation R for C, and an index 
function afar C that defines the parity objective <5q,. Let Gc be the corresponding perfect-information stochastic 
game with parity objective There exists an almost-sure composer iff there exists an almost-sure winning 

strategy in Gcfrom (gg, 0), and there exists an rj-optimal composer iff the value in Gc at ((/g, 0) is at least rj. 

Proof Sufficiency of valid pure memoryless strategies. Note that in the game Gc we ean restriet to only valid 
pure memoryless strategies heeause for a state s G 5i if an aetion is ehosen that is not allowed by a valid strategy, 
then it leads to the losing absorbing state _L. Thus in Gc if there is an almost-sure winning strategy (resp., a 
strategy to ensure that the value is at least p), then there is a valid pure memoryless strategy to ensure the same 
(Theorem 3.1). I 

Lemma 3.3 along with Theorem 3.1 imply that the qualitative and quantitative problems for the realizability 
for embedded parity lie in NP n eoNP (also UP n eoUP). We now present two related results. We first show 
that with unrestrieted exit eontrol relation the problem ean be solved in polynomial time, and then show that in 
general (i.e., with exit eontrol relation) the problem is at least as hard as solving perfeet-information deterministie 
parity games (for whieh no polynomial-time algorithm is known). 

The unrestricted exit control relation problem. For the unrestrieted exit eontrol relation problem, we modify 
the eonstruetion of the game Gc as follows: we add another state T that belongs to player 1, and for every state 
s G 5i and every aetion a G the next state is T, and from T player 1 ean ehoose any eomponent (i.e., state 
ioifa) iri Gc). We refer to the modified game as Gc. If the exit eontrol relation is unrestrieted then the result 
eorresponding to Lemma 3.3 holds for Gc. However, in Gc the number of memory less player-1 strategies is only 
k ->r 1 (one eaeh for the ehoiee of eaeh eomponent at T), and onee a memoryless strategy for player 1 is fixed we 
obtain an MDP whieh ean be solved in polynomial time (and the qualitative problem in strongly polynomial time 
by diserete graph theoretie algorithms) [16, 13, 10, 11]. Henee it follows that both the qualitative and quantitative 
realizability and synthesis problems for embedded parity with unrestrieted exit eontrol relation ean be solved in 
polynomial time. 

The hardness reduction. We now present a reduetion form perfeet-information deterministie parity games to 
the realizability problem for embedded parity. For simplieity we eonsider alternating games where the players 
make move in alternate turns, i.e., we eonsider a perfeet-information game G = {S, Si, S 2 , Ai, A 2 ,d^) where 
6^ is deterministie and is deeomposed into two funetions Sf : Si x Ai ^ S 2 and : S 2 x A 2 ^ Si 
(player-1 move leads to player-2 state and viee versa). A perfeet-information deterministie game with an index 
funetion a ean be eonverted to an equivalent alternating game with a linear blow-up by adding dummy states. 
Given an alternating perfeet-information game, let S 2 = {sg, sf, ..., and Si = {s}, s^,..., s^}. We 

eonstruet L of width Si, an exit eontrol relation R, and an index funetion a' as follows: (a) there are /c -|- 1 
eomponents Mq, Mi, ..., Mk one for eaeh state in S 2 ', (b) eaeh Mi = (S/, So, Qi, Qq, 5i, Fi, Lf) is defined 
as follows: (i) S/ = A 2 , (ii) So and Lj are not relevant for the reduetion, (iii) Qi = {qq, q\,..., qf} with 
Fi = Qi\ {qg}, and (iv) 5i{ql,a) = q] where s] = 5^{sl, a) for all a G S/. The exit eontrol relation R is 
as follows: R = {(f, Mj) \ 3ai G Ai. ai) = s^}. Intuitively, there exists a eomponent for eaeh state in 

S 2 , and the exit states for eaeh eomponent eorresponds to states of player 1. In the start state qg for eomponent 
Mi the transition funetion Si represents the transition funetion S 2 of the perfeet-information game, i.e., given 
an input letter a whieh is an aetion for player 2, if the transition given a is from sj to sj, then in Mi there is a 
eorresponding transition to q*. The exit eontrol relation represents the transitions for player 1. Sinee the exit states 
in eaeh eomponent is reaehed in one step, the eomposer strategies for the library represents perfeet-information 
strategies. The index funetion a' is as follows: a'{qlf = a{sl) and afqj) = «(«]) for j > 1. There exists an 
almost-sure winning strategy in G from Sg iff there exists an almost-sure eomposer for C with R. Also note that 
for the reduetion every eomponent transdueer is deterministie. 
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Theorem 3.2. (Complexity of embedded parity realizability) Consider a library C of width D , a 
non-blocking exit control relation R for C, and an index function a for L that defines the parity objective 
The following assertions hold: 

1. The realizability problem belongs to NP PI coNP (also UP H coUP), and is at least as hard as the (almost- 
sure) decision problem for perfect-information deterministic parity games. 

2. If R is an unrestricted exit control relation, then the realizability and synthesis problems can be solved in 
polynomial time 

4 The Complexity of Realizability for DPW Specifications 

In this section we present three results. First, we present a new result for partial-observation stochastic parity 
games. Second, we show that the qualitative realizability problem for DPW specifications can be reduced to our 
solution for partial-observation stochastic games yielding an EXPTIME-complete result for the problem. Einally, 
we show that the quantitative realizability problem for DPW specifications is undecidable. 

4,1 Partial-observation Stochastic Parity Games In this section we consider partial-observation games with 
various restrictions on strategies and present a new result for a class of strategies that correspond to the analysis 
of the qualitative realizability problem. 

Partial-observation stochastic games. In a stochastic game with partial observation, some states are not 
distinguishable for player 1. We say that they have the same observation for player 1. Eormally, a partial- 
observation stochastic game consists of a stochastic game G = {S, Si, S 2 , Ai, A2,5^), a finite set O of 
observations, and a mapping obs : S ^ O that assigns to each state s of the game an observation obs(s) 
for player 1 . 

Observational equivalence and strategies. The observation mapping induces indistinguishability of play 
prefixes for player 1 , and fherefore we need fo consider only fhe player -1 sfrafegies fhaf play in fhe same way 
after fwo indistinguishable play prefixes. We consider fhree differenf classes of sfrafegies depending on fhe 
indisfinguishabilify of play prefixes for player 1 and fhey are as follows: (i) fhe play prefixes have fhe same 
observation sequence; (ii) fhe play prefixes have fhe same observation sequence, excepf fhaf fhe lasf observafion 
may be repeafed arbifrarily many fimes; and (hi) fhe play prefixes have fhe same sequence of disfincf observafions, 
fhaf is fhey have fhe same observafion sequence up fo repefifion (sfuffering). We now formally define fhe classes 
of sfrafegies. 

Classes of strategies. The observation sequence of a sequence p = sqSi .. .Sn is the sequence obs(/ 9 ) = 
obs(so) • • • obs(sn) of state observations; the collapsed stuttering of p is the sequence obs(/ 9 ) = O 0 O 1 O 2 ... of 
distinct observations defined inductively as follows: oq = obs(so) and for all f > 1 we have Oj = obs(sj) 
if obs(si) 7 ^ obs(si_i), and Oi = e otherwise (where e is the empty sequence). We consider three types of 
strategies. A strategy tt for player 1 is 

• observation-based if for all sequences p, p' G 5"'' such that last(/ 9 ) G Si and last(/)') G Si, if 
obs(p) = obs(/)') then tt{p) = ^{p'y, 

• observation-based stutter-invariant if it is observation-based and for sequences p G S^, for all states 
s G 5, if obs(s) = obs(last(/ 9 )), then 7 r(p) = 7 r(/ 9 s); 

• observation-based collapsed-stutter-invariant if for all sequences p, p' G S^ such that last(/ 9 ) G Si and 
last(p') G Si, if obs(p) = obs(p'), then ^{p) = 'k{p'). 


The key difference between observation-based stutter-invariant and observation-based collapsed-stutter- 
invariant strategies is as follows: they both must play the same action while an observation is repeated, however. 



b 



Figure 1: Reachability game where there is no collapsed-stuttering invariant winning strategy, but there is a 
observation-based stutter-invariant winning strategy. States {si, S 2 , S 3 } are indistinguishable for player 1, as well 
as {ss, se}. The objective for player 1 is to reach 57 . 


collapsed-stutter-invariant strategies cannot observe the length of the repetitions of an observation, whereas a 
stutter-invariant strategy can. In the following example we illustrate this difference. 

Example. Figure 1 shows a (non-stochastic) reachability game where the objective of player 1 is to reach 57 . 
The states si, S 2 , S3 have the same observation (for player 1), and the states S5, se as well. The game starts in a 
player-2 state sq with successors si and S 2 - Thus after one step, player 1 does not know whether the game is in si 
or in S 2 - If the game is in S 2 , then the action b leads to S4 from where the target state 57 is not reachable. Hence 
playing b is not a good choice. Playing a gives either a new observation (se is reached), or the same observation 
as in the previous step (53 is reached). At this point, a simple observation-based strategy can play b and reach 
S 7 for sure. However, an observation-based stutter-invariant (or collapsed-stutter-invariant) strategy must keep 
playing a in S 3 (since the observation did not change) and reaches 55 with same observation as in 55 . Now, an 
observation-based stutter-invariant strategy wins by playing a in 55 and 6 in sg. Indeed with observation-based 
stutter-invariant strategies, player 1 can distinguish whether the game is in 55 or in sg (simply looking at the 
length of the play prefix in this case). However, with an observation-based collapsed-stutter-invariant strategy, 
player 1 cannot win because the play prefixes pi = sq ... and P 2 = sq ... sq have fhe same collapsed sfuffering 
sequence of observations, fhus forcing player 1 fo choose fhe same acfion a or b, and acfion a is losing for p 2 and 
action b for pi. Hence in fhis game fhere is no observation-based collapsed-sfuffer-invarianf winning sfrafegy, buf 
fhere is an observation-based sfuffer-invarianf winning sfrafegy. 

The previous example shows fhaf collapsed-sfuffer-invarianf sfrafegies are differenf from sfuffer-invarianf, 
as well as from sfandard observation-based sfrafegies. Our goal is fo decide fhe exisfence of finife-memory 
almosf-sure winning sfrafegies in parfial-observafion stochastic parify games. This problem has been sfudied 
for observafion-based sfrafegies and opfimal complexify resulf (EXPTIME-complefeness) has been esfablished 
in [ 8 ]. We now presenf a polynomial-fime reduction for deciding fhe exisfence of finife-memory almosf-sure 
winning collapsed-sfuffer-invarianf sfrafegies fo observafion-based sfrafegies. 

Reduction of collapsed-stutter-invariant problem to observation-based problem. There are two main ideas 
of the reduction. (1) The first is that whenever player 1 plays an action a, the action a is stored in the state space 
as long as the observation of the state remains the same. This allows to check that player 1 plays always the same 
action along a sequence of identical observations. This only captures the stutter-invariant restriction, but not the 
collapsed-stutter-invariant restriction. (2) Second, whenever a transition is executed, player 2 is allowed to loop 
arbitrarily many times through the new state. This ensures that player 1 cannot rely on the number of times he 
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in {s, x), if X G Ai, then player 1 should play the stored action x; 

if X = 0, no action is stored and player 1 can choose any action. 
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player 2 can play all actions available in the original game, and 
repeat arbitrarily many times the current observation. 


Figure 2: Game transformation for the reduction of collapsed-stutter-invariant problem to observation-based 
problem. 


sees an observation, thus that player 1 is collapsed-stutter-invariant. However, it should be forbidden for player 2 
to loop forever in a state, which can be ensured by assigning priority 0 to the loop (hence player 1 would win the 
parity objective if the loop is taken forever by player 2). We now formally present the reduction. 

The formal reduction. The reduction is illustrated in Figure 2 and formally presented below. Given a partial- 
observation stochastic game G = {S, Si, S 2 , Ai, A 2 ,S^) with observation mapping obs : 5 —0, we construct 
a game G' = {S', S'j, ^ 1 , ^ 2 ) follows: 

• S' = S X {Ai U yli U {0,0}) U {_L} where Ai = {a \ a ^ ^ 1 }, assuming that 0 0 Ai. The states (s, 0) 
are a copy of the state space of the original game, and in the states {s, a) with s £ Si and a £ Ai, player 1 
is required to play action a; in the states {s, 0) and {s,a), player 2 can stay for arbitrarily many steps. The 
state _L is absorbing and losing for player 1. 

. = 5i X (^1 U {0}) U {T}. 

• S'2 = S'\S'^ = {S 2 X (^1 U {0})) U (5 X (:4i U {0})). 

• ^2 = ^2 U {tj}, assuming (t 0 A 2 . 

• The probabilistic transition function 6^^' is defined as follows: for all player-1 states {s, x) £ and actions 
a £ Ai. 

- if X G \ {a}, then let 5'^' {{s, x), a))(_L) = 1, that is player 1 loses the game if he does not play 
the stored action; 

- if X = a or X = 0, then for all s' £ S' let 

5'^'{{s, x), a)){{s',a)) = S'^{s, a){s') if obs(s') = obs(s), and let 

, x),a)){{s', 0)) = S'^{s, a){s') if obs(s') / obs(s); thus we store the action a as long as the 
state observation does not change; 











- All other probabilities 5^'{{s,x),a)){-) are set to 0 , for example 6^'{{s,0),a)){{s',y)) = 0 for all 
y^a\ 

and for all player-2 states (s, x) G and aetions a G A2: 

- if X G Ai U { 0 }, then for all s' G S' let 

x), a))((s', x)) = 5*^(5, a)(s') if obs(s') = obs(s), and let 
5 «'((s , x), a))((s', 0)) = a){s') if obs(s') 7^ obs(s); thus all aetions are available to player 2 

as in the original game, and the stored aetion x of player 1 is maintained if the state observation does 
not ehange; 

- if X = 6 for some 6 G Ai U { 0 }, then let 
( 5 '^'((s, 6 ),tt))((sJ)) = 1 , and 

5^'{{s, b),a)){{s, 6)) = 1 if a 7^ JJ; thus player 2 ean deeide to stay arbitrarily long in (s, 6) before 
going baek to (s, b); 

- All other probabilities ((s, x), a))(-) and 5 *^^ ((s, x), ([))(•) are set to 0 . 

The observation mapping obs' is defined aeeording to the first eomponent of the state: obs'((s, x)) = obs(s). 
Given an index funetion a for G, define fhe index funefion a' for G' as follows: a'((s,x}) = a(s) and 
a'((s,x)) = 0 for all s G 5 and x G Ai U { 0 }, and a'(-L) = 1 . Henee, fhe slate _L is losing for player 1 , 
and fhe player-2 slales (s, x) are winning for player 1 if player 2 slays Ihere forever. 

Lemma 4 . 1 . Given a partial-observation stochastic game G with observation mapping obs and parity objec¬ 
tive <hQ, defined by the index function a, a game G' with observation mapping obs' and parity objective ^a' 
defined by the index function a' can be constructed in polynomial time such that the following statements are 
equivalent: 

• there exists a finite-memory almost-sure winning observation-based collapsed-stutter-invariant strategy tt 
for player 1 in G from sofor the parity objective 

• there exists a finite-memory almost-sure winning observation-based strategy tt' for player 1 in G' from 
(sO) 0)/or the parity objective ^a'- 

Correctness argument. The eorreefness proof has two direetions: first, given a finite-memory almost-sure 
winning eollapsed-stutter-invariant strategy in G to eonstruet a similar witness of observation-based strategy in 
G', and viee versa (the seeond direetion). We present both direetions below. 

First direction. For the first direetion, given a finite-memory almost-sure winning eollapsed-stutter-invariant 
strategy tt for player 1 in G, we eonstruet a finite-memory observation-based strategy tt' for player 1 in G' sueh 
that tt' is almost-sure winning in G'. 

To define 7r'(p') for a play prefix p' in G' sueh that last(/9') G S'^, we eonstruet a play prefix p in the original 
game G and define ir'lp') = 7r(p). We eonstruet p as p{p') where ^ is a mapping that first removes from p' all 
states of the form {s,x) for s G 5 and x G Ai U { 0 }, and then projeets all the other states {t, ■) to their first 
eomponent t. It follows that p = p{p') is a play prefix in G and if the observation sequenees of two prefixes 
p'l, p '2 in G' are the same (i.e., obs'(/)'j^) = obs'(/92)). then the eollapsed stuttering of p{p'i) and is also the 
same. It follows that the eonstrueted strategy tt' is well defined, and sinee tt is eollapsed-stutter-invariant, tt' is 
observation-based. 

We show that tt' is almost-sure winning for the parity objeetive in G' . Consider an arbitrary strategy r' 

for player 2 in G' . We ean assume without loss of generality that: 
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• t' is pure since when strategy tt' is fixed in G', we get a 1-player stochastic games, and pure strategies are 
sufficient in 1-player stochastic games [7]; 

• no play compatible with vr' and r' gets stuck in a state of the form {s,x) for s G S and xGyliU{0} 
(i.e., no play loops forever through a self-loop on some state (s, x) after some prefix p'y, fhis assumpfion is 
also wifhouf loss of generalify because if t' is a spoiling sfrafegy (i.e., if ensures againsf tt' fhaf fhe parify 
objecfive is satisfied wifh probabilify less fhan 1) and a play gels sluck after some prefix p', Ihen we can 
define a sfrafegy t” lhal plays arbifrarily after p' buf does nol gel sluck, i.e., never plays [J, since gelling 
sluck forever implies winning for player 1. If follows lhal r" is also a spoiling sfrafegy and never gels 
sluck. 

From fhe sfrafegy t' we define a sfrafegy r for player 2 in G (lhal basically mimics t' ignoring fhe jj aclions). 
If follows by induction lhal (up lo fhe mapping p) fhe probabilify measure over plays in G under slralegies tt 
and r coincides wifh fhe probabilify measure over plays in G' under slralegies vr' and t' . Since p preserves fhe 
satisfaction of fhe parify objecfive, if follows lhal wifh probabilify 1 fhe parify objecfive is salisfied in G' under 
tt' and r', and Ihus tt' is an almosl-sure winning observation-based sfrafegy in G'. 

Second direction. For fhe second direction of fhe lemma, consider lhal Ihere exisls a finile-memory al most-sure 
winning observation-based strategy vr' for player 1 in G' from state (sq, 0) for the parity objective and we 
show that there exists an observation-based collapsed-stutter-invariant almost-sure winning observation-based 
strategy for player 1 in G from sq- 

Let (fUt', ttIq, 7 r(j, vr^) be a transducer that encodes vr' (thus 971' is finite). We construct a transducer 
(971, mo, 7r„, vr„) and show that it encodes an observation-based collapsed-stutter-invariant strategy tt that is 
almost-sure winning in G from sq for tho parity objective <1 >q,. Intuitively, given a memory value m' G 971' and 
a sequence of states with identical observation o visited in the game G', the memory will be updated (according 
to 7r(j) and the actions played (according to tt'^) may be different depending on the number of repetitions of the 
observation o. To construct a collapsed-stutter-invariant strategy, we update the memory to a value that occurs 
infinitely often in the sequence of memory updates obtained when observing o repeatedly. This ensures that, 
as long as the observation does not change, the constructed strategy plays always the same action. Moreover, 
this action could indeed be played by the original strategy (even after an arbitrarily long sequence of identical 
observations). The transducer for vr is defined as follows: 

• 971 = 971' X O. A memory value m = (m', o) corresponds lo memory value m' in fhe fransducer for tt', 
and fhe currenl observation is o. As long as fhe nexf observation is o, fhe memory value m does nol change 
(Ihere is a self-loop on m for all inpuls s such lhal obs(s) = o). 

• mo = (m', obs(so)) where m' is such lhal m' = 7r(j(mQ, Sq) for infinitely many n, and Sq = sqSo ... sq is 

^ V 

n times 

fhe n-fold repelilion of sq (such m' always exisls since 971' is finite); note lhal in fhe definition of m' we 
can equivalenlly replace Sq by s” for any s such lhal obs(s) = obs(so) since fhe sfrafegy tt' is observation- 
based. 

• For all (m[,oi) G 971 and s G S', if obs(s) = oi, Ihen 7ru((m[,oi), s) = (m[,oi) (self-loop), and if 
obs(s) / oi, Ihen 7ru((m[,oi), s) = (m' 2 , obs(s)) where m^ = 7r^(m[,s^) for infinitely many n. 

• 7r„((m',o)) = <(m'). 

Firsl, we show lhal fhe sfrafegy vr is almosl-sure winning in G from sq. The proof of Ibis claim is by 
conlradiclion: assume lhal vr is nol almosl-sure winning. Then Ihere exisls a spoiling sfrafegy r for player 2 such 
lhal fhe parity objective is salisfied wifh probabilify less fhan 1. From r, we define a sfrafegy r' for player 2 



in G' intuitively as follows: the strategy r' mimies the strategy r when the eurrent state is of the form (s, x) 
where s € 52 and a; G vli U {0}, and ensures the following invariant: given any prefix p' in G', the memory 
value of vr' after p' is m' if and only if the memory value of vr after p{p') is of the form {m', ■), where p, is the 
mapping defined in fhe firsf direefion of fhe proof. Player 2 ean always ensure fhis invarianf as follows: given 
fhe memory value of tt is updated fo {m', o), repeal fhe self-loop on aelion (j suffieienlly many limes lo lei Ihe 
memory value of vr' be updated lo m', whieh is always possible sinee by definition of Ihe Iransdueer for vr, Ihe 
value m' is “hil” infinitely often in Ihe Iransdueer for tt' when a slate wilh observation o is visited forever. For 
example, Ihe slralegy r' slays in Ihe initial slate (sq, 0), whieh is a player 2 slate, for n steps where n is such 
lhal m' = 7r'^(mQ, Sq) where m' is such lhal (m', obs(so)) is Ihe initial memory value of vr. If follows from Ibis 
definition of r' lhal for all play prefixes p' in G' lhal are compatible wilh tt' and t' in G', Ihe play prefix p(p') is 
compatible wilh tt and r and has Ihe same probabilily as p'. Therefore Ihe respective probabilily measures in G 
and in G' coincide (up lo Ihe mapping p) and since for all infinite plays pin G and p' in G' such lhal p = p{p'), 
we have p G $ 0 , if and only if p G it follows lhal r' is a spoiling slralegy in G (Ihe parity objective is 
satisfied wilh probability less lhan 1 under slralegies tt' and t'). This conlradicls lhal tt' is almosl-sure winning. 
Hence Ihe original claim lhal Ihe slralegy tt is almosl-sure winning in G holds. 

Second, we show lhal Ihe slralegy tt is observation-based collapsed-sluller-invarianl. We have by induction 
lhal TTu{mo, p) = TTu{mo,p ■ s) for all play prefixes p and slates s such lhal obs(s) = obs(last(p)), and il follows 
lhal 7r(p) = 7r(p') if obs(p) = obs(p') which concludes Ihe argument 

Since solving partial-observation stochastic parity games wilh finite-memory observation-based slralegies is 
EXPTIME-complele for almosl-sure winning [8], we gel an EXPTIME upper bound for games wilh observation- 
based collapsed-sluller-invarianl slralegy by Ihe reduction in Eemma 4.1. The same complexity resulls hold 
for Ihe class of observation sluller-invarianl slralegies, by removing all ft-labelled self-loops for player 2 in Ihe 
reduction for collapsed-sluller-invarianl slralegies. We note lhal an EXPTIME lower bound can be eslablished 
for Ihose problems by a converse reduction lhal inlroduces in every Iransilion an intermediate dummy slate wilh a 
differenl observation, Ihus Iwo consecutive observations are always different and Ihe observation-based slralegies 
are also collapsed-sluller-invarianl. 

Theorem 4.1. The qualitative problem of deciding whether there exists a finite-memory almost-sure winning 
observation-based collapsed-stutter-invariant (or stutter-invariant) strategy in partial-observation stochastic 
games with parity objectives is EXPTIME-complete. 


4.2 The Complexity of Qualitative Realizability We now present Ihe complexity result for qualitative 
realizability for DPW specifications via a reduction to the problem of deciding the existence of finite-memory 
almost-sure winning collapsed-stutter-invariant strategies in partial-observation games. The reduction formalizes 
the intuition described in Remark 1. 

Reduction of synthesis for DPW specifications to collapsed-stutter-invariant problem. The reduction is 
analogous to the upper-bound reduction presented in Section 3.2. Given a library L of width D and a DPW, the 
game we construct is the product of the game Gn with the DPW. The states are of the form (q, i, p) where (q, i) is a 
state of Gc and p is a state of the DPW. The third component p is updated according to the deterministic transition 
function 5p of the DPW. Thus the successors of (q, i,p) are of the form (•, -^p') where p' = 5p{p, Lfiq)). Eor 
example, the transitions for player-2 states s = (q, i,p), and actions a G yl2 are defined by 




^q, <y){q') if« = J and p' = 6p{p, Lfiq)) 
0 otherwise 


The index function in the game is defined according to the third component of the states and according 
to the index function ap of the DPW, thus aG{{q,i,p)) = ap{p). The observation mapping is defined by 
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obs{{q,i,p)) = iif q ^ Fi, and obs{{q,i,p)) = {q,i) if g € Fj. Thus the state of the DPW is not observable, 
and only the eomponents name and exit states are observable. Note that O = [k]\J U^=o(-^* ^ ‘t^J’) where 
[k] = {0,1, 2,, k} and the number of eomponents is /c + 1. The eorreetness argument is established using 
similar arguments as in Seetion 3.2, by showing that a eomposer for L ean be mapped to a eollapsed-stutter- 
invariant strategy in Gc that is almost-sure winning, and viee versa we ean eonstruet a eomposer for L from an 
almost-sure winning eollapsed-stutter-invariant strategy in Gc. by the inverse mapping. 

This reduetion and Theorem 4.1 show that the realizability problem with DPW speeifieations ean be solved 
in EXPTIME, and an EXPTIME lower bound is known for this problem [2]. 

Theorem 4.2. The qualitative realizability problem for controlflow composition with DPW specifications is 
EXPTIME-complete. 

4.3 Undecidability of the Quantitative Realizability In this seetion we establish undeeidability of the 
quantitative realizability problem by a reduetion from the quantitative deeision problem for probabilistie automata 
(whieh is undeeidable). 

Probabilistic automata. A probabilistic automaton A = {T,,Q,qQ,6) is a probabilistic ttansducer without 
outputs and exit states, i.e., S consists of the input letters, Q is the finite state space with initial state qo, and 
6 : Q X X ^ D{Q) is the probabilistic transition function. Consider a probabilistic automaton A with an 
index function aonQ. A word is an infinite sequence of letters from S, and a lasso-shaped word w = wi{w 2 )^ 
consists of a finite word wi followed by an infinite repetition of a non-empty finite word W 2 - Given a probabilistic 
automaton A with index function a, the quantitative decision problem of whether there exists a lasso-shaped 
word that is accepted with probability at least rj is undeeidable, for rational rj G (0,1) given as input [22], and 
the undeeidability proof holds even for index function for reachability, Biichi, or coBiichi objectives. 

The key ideas of reduction. The key ideas of the reduction are as follows. We consider a component for each 
letter of the input alphabet, and each component has a unique exit (i.e., \D\ = 1). Hence a composer represents 
a choice of word, and since the state space of a composer is finite, a composer represents a lasso-shaped word. 
Conversely, for every lasso-shaped word there is a composer. In each component, in the starting state, there is a 
choice by the environment among the states of the probabilistic automaton, and given the choice of a state, the 
probabilistic transition is executed according to the current state and choice of letter (represented by the choice 
component), and finally there is a transition to the unique exit state. To ensure that the choice in each component 
really chooses the correct state of the probabilistic automaton we use the DPW. The DPW keeps track of the 
current state of the probabilistic automaton, and requires that the choice from the starting state of the component 
matches the current state (otherwise it accepts immediately). 

The reductiou. Consider a probabilistic automaton A = {'E,Q,qo,5) with an index function a on Q. Eet 
S = {0,1, 2,... , k}. We construct a library E with k -\- 1 components Mq, Mi, ..., Mk each with a unique exit 
as follows. We have Mi = {Ej, Eq, Qi, q^, di, Fi,Li) where the components are as follows: 

1. Ej = Q; and Eq = Q U {$}; 

2- Qi = {qh} U (Q X {1, 2}) U {exj; 

3. Fi = {exj; 

4. Lfiql) = Lfiexi) = $ and Li{{q,j)) = q for q^Q and j G {1,2}; and 

5. (a) 6i{qQ,a){{a, 1)) = 1 (i.e., given the start state and an input letter a = q corresponding to a state of 
Q, the next state is {q, 1)); (b) Si{{q, l),a){{q',2)) = S{q,i){q') (i.e., irrespective of the input choice a, 
the second component changes from 1 to 2, and the first component changes according to the transition 
function 5 of ^ for the choice of input letter i); and (c) di{{q, 2},a)(exi) = 1 (i.e., irrespective of choice 
of a the next state is the unique exit state). 

The DPW. We now describe the DPW along with the library of components. The DPW has alphabet Q U {$} 
and state space (Q x (0,1, 2}) U {T}. The transition function Sp is as follows: 



1. 6p{{q,0),a) = (g, 1) if a = q, else if a ^ q, then (5((g, 0),cj) = T (i.e., in a state where the seeond 
eomponent is 0 the automaton expeets to read the same input as the first eomponent, and if it reads so it 
ehanges the seeond eomponent from 0 to 1, otherwise it goes to the T state). This step eorresponds to 
reading the ehoiee from the start state of a eomponent. 

2. 5p{{q, 1), (t) = (cj, 2) (i.e., when the seeond eomponent is 1 it updates the first eomponent aeeording to 
the transition read, and the seeond eomponent ehanges from 1 to 2). This step eorresponds to reading the 
transition in a eomponent that mimies the transition of the probabilistie automaton. 

3. 5p{{q, 2), a) = {q, 0) (i.e., irrespeetive of the input, the first eomponent remains the same, and the seeond 
eomponent changes from 2 to 0). This step corresponds to reading the transition to the exit state in a 
component. 

To be very precise, one also needs to add more states in the DPW for transition from the exit state of a component 
to the start state of the next component (which is omitted for simplicity). The state T is an absorbing state 
(irrespective of the input the next state is T itself). The index function ap for the DPW maps according 
to the index function of a and the first component, i.e., for {q,i) where q ^ Q and i G {0,1,2} we have 
ap{{q, i)) = a{q); and ap(T) = 0. 

Correctness argument. Given the probabilistic automaton A with index function a, let be the corresponding 
parity objective. For a lasso-shaped word w, let P“'(<I>o) denote the probability that the word satisfies fhe 
parify objecfive. Given a composer fhaf corresponds fo a lasso-shaped word w, consider a sfrafegy of fhe 
environmenf fhaf given fhe sfarfing sfafe of a componenf always chooses fhe slate according lo fhe firsl componenl 
of fhe currenl sfafe of fhe DPW. Then if follows fhaf fhe probabilify disfribufion of A is execuled, and hence we 
have P“'(^>q) < val(7b™,where is the parity objective induced by the DPW. Conversely, if the 
environment does not choose according to the current state of the DPW, then the DPW immediately accepts. It 
follows that P“'(<hQ) > val(7c„, ^hop), and thus we have P"'(<hQ,) = val(7c„, ‘hop). Hence the answer to the 
quantitative realizability problem is YES iff the answer to the quantitative decision problem for A is YES. We 
have the following result. 

Theorem 4.3. The quantitative realizability problem for controlflow composition with DPW specifications is 
undecidable. 
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